Abstract

The purpose of this work is to describe a duality between a fragmentation associated to
certain Dirichlet distributions and a natural random coagulation. The dual fragmentation 
and coalescent chains arising in this setting appear in the description of the genealogy of 
Yule processes. 

1 Introduction 

At a naive level, fragmentation and coagulation are inverse phenomena, in that a simple 
time-reversal changes one into the other. However, stochastic models for fragmentation 
and coalescence usually impose strong hypotheses on the dynamics of the processes, such 
as the branching property for fragmentation (distinct fragments evolve independently as 
time passes), and these requirements do not tend to be compatible with time-reversal. 
Thus, in general, the time-reversal of a coalescent process is not a fragmentation process. 

Nonetheless, there are a few special cases in which time-reversal does transform a 
coalescent process into a fragmentation process. Probably the most important example 
was discovered by Pitman [17]; it is related to the so-called cascades of Ruelle and the
Bolthausen-Sznitman coalescent [?], and also has a natural interpretation in terms of the
genealogy of a remarkable branching process considered by Neveu, see [?] and [?].

The first purpose of this note is to point out other simple instances of such duality, 
which rely on certain Dirichlet and Poisson-Dirichlet distributions. Then, in the second 
part, we shall show that these examples are related to the genealogy of Yule processes. 
2 Dual fragmentation and coagulation



2.1 Some notation 

For every integer $n \ge 1$, we consider the simplex

$$\Delta_n := \Big\{ x = (x_1, \ldots, x_{n+1}) : x_i \ge 0 \text{ for every } i = 1, \ldots, n+1 \text{ and } \sum_{i=1}^{n+1} x_i = 1 \Big\}.$$

It will also be convenient to agree that $\Delta_0 := \{1\}$. We shall often refer to the coordinates
$x_1, \ldots, x_{n+1}$ of points $x$ in $\Delta_n$ as masses.

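We recall that the $n$-dimensional Dirichlet distribution with parameter $(\alpha_1, \ldots, \alpha_{n+1})$
is the probability measure on the simplex $\Delta_n$ with density

$$\frac{\Gamma(\alpha_1 + \cdots + \alpha_{n+1})}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_{n+1})}\, x_1^{\alpha_1 - 1} \cdots x_{n+1}^{\alpha_{n+1} - 1}.$$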

The special case when $\alpha_1 = \cdots = \alpha_{n+1} := \alpha \in\, ]0, \infty[$ will have an important role in this
work; it will be convenient to write $\mathrm{Dir}_n(\alpha)$ for this distribution. We recall the following
well-known construction: let $\gamma_1, \ldots, \gamma_{n+1}$ be i.i.d. gamma variables with parameters $(\alpha, c)$.
Set $\gamma = \gamma_1 + \cdots + \gamma_{n+1}$, so that $\gamma$ has a gamma distribution with parameters $(\alpha(n+1), c)$.
Then the $(n+1)$-tuple

$$(\gamma_1/\gamma, \ldots, \gamma_{n+1}/\gamma)$$

has the distribution $\mathrm{Dir}_n(\alpha)$ and is independent of $\gamma$.
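The gamma construction just recalled translates directly into a sampler. The following is a minimal sketch using only the Python standard library; the name `sample_dirichlet` is illustrative, and `c` is read here as a rate parameter, which is an assumption (the normalisation removes its effect on the simplex point in any case).

```python
import random

def sample_dirichlet(n, alpha, c=1.0, rng=random):
    """Return (point of Delta_n with law Dir_n(alpha), total gamma mass)."""
    # random.gammavariate takes (shape, scale); c is treated as a rate here
    # (an assumption), so the scale is 1/c.
    gammas = [rng.gammavariate(alpha, 1.0 / c) for _ in range(n + 1)]
    total = sum(gammas)
    # Normalising the i.i.d. gamma variables gives the Dirichlet point.
    return [g / total for g in gammas], total

if __name__ == "__main__":
    point, total = sample_dirichlet(n=3, alpha=0.5)
    print(point, total)
```

The function returns the total mass as well, since the text stresses that it is independent of the normalised vector.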
We also define the (ranked) infinite simplex

$$\Delta_\infty := \Big\{ x = (x_1, x_2, \ldots) : x_1 \ge x_2 \ge \cdots \ge 0 \text{ and } \sum_{i=1}^{\infty} x_i = 1 \Big\}$$

and recall that the Poisson-Dirichlet distribution with parameter $\theta > 0$, which will be
denoted by PD$(\theta)$ in the sequel, is the law of the random sequence

$$\Big( \frac{a_1}{\sum_{i=1}^{\infty} a_i}, \frac{a_2}{\sum_{i=1}^{\infty} a_i}, \ldots \Big),$$

where $a_1 \ge a_2 \ge \cdots \ge 0$ are the atoms of a Poisson random measure on $]0, \infty[$ with
intensity $\theta y^{-1} e^{-y}\,dy$. We also recall that this random sequence is independent of $\sum_{i=1}^{\infty} a_i$ and that the latter has
the gamma distribution with parameters $(\theta, 1)$. By the celebrated Lévy-Itô decomposition
of subordinators, we may also rephrase this construction as follows: if $\gamma = (\gamma(t), t \ge 0)$ is
a standard gamma process and, for each fixed $\theta > 0$, $\delta_1 \ge \delta_2 \ge \cdots$ denotes the sequence
of sizes of the jumps of $\gamma$ on the time interval $[0, \theta]$, then

$$\Big( \frac{\delta_1}{\gamma(\theta)}, \frac{\delta_2}{\gamma(\theta)}, \ldots \Big)$$

has the PD$(\theta)$ distribution and is independent of $\gamma(\theta)$.
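For numerical experiments, a PD$(\theta)$ sequence can be approximated without simulating a Poisson random measure. The sketch below does not use the construction in the text. It swaps in the classical stick-breaking (GEM) representation, in which the successive stick fractions are i.i.d. Beta$(1, \theta)$ and the ranked masses have the PD$(\theta)$ law. The function name and the truncation level `n_terms` are illustrative assumptions.

```python
import random

def sample_pd_truncated(theta, n_terms=200, rng=random):
    """Approximate PD(theta) sample: ranked stick-breaking masses, truncated.

    Returns (masses in decreasing order, leftover mass not yet broken).
    """
    remaining = 1.0
    masses = []
    for _ in range(n_terms):
        fraction = rng.betavariate(1.0, theta)  # Beta(1, theta) stick fraction
        masses.append(remaining * fraction)
        remaining *= 1.0 - fraction
    masses.sort(reverse=True)  # ranking turns the GEM sequence into PD order
    return masses, remaining

if __name__ == "__main__":
    masses, leftover = sample_pd_truncated(theta=1.0)
    print(masses[:5], leftover)
```

The leftover mass decays geometrically in `n_terms`, so the truncation error can be monitored directly from the second return value.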
2.2 Two dual random transformations



We now define two random transformations: 

$$\mathrm{Frag}_k : \Delta_n \to \Delta_{n+k} \quad\text{and}\quad \mathrm{Coag}_k : \Delta_{n+k} \to \Delta_n,$$

where $k, n$ are integers.

First, we fix $x = (x_1, \ldots, x_{n+1}) \in \Delta_n$ and pick an index $I \in \{1, \ldots, n+1\}$ at random
according to the distribution

$$P(I = i) = x_i, \qquad i = 1, \ldots, n+1,$$

so that $x_I$ is a size-biased pick from the sequence $x$. Let $\eta = (\eta_1, \ldots, \eta_{k+1})$ be a random
variable with values in $\Delta_k$ which is distributed according to $\mathrm{Dir}_k(1/k)$ and independent
of $I$. Then we split the $I$th mass of $x$ according to $\eta$ and we obtain a random variable in
$\Delta_{n+k}$:

$$\mathrm{Frag}_k(x) := (x_1, \ldots, x_{I-1}, x_I \eta_1, \ldots, x_I \eta_{k+1}, x_{I+1}, \ldots, x_{n+1}).$$

Second, we fix $x = (x_1, \ldots, x_{n+k+1}) \in \Delta_{n+k}$ and pick an index $J \in \{1, \ldots, n+1\}$
uniformly at random. We merge the $k+1$ masses $x_J, x_{J+1}, \ldots, x_{J+k}$ to form a single mass
$\sum_{i=J}^{J+k} x_i$ and leave the other masses unchanged. We obtain a random variable in $\Delta_n$:

$$\mathrm{Coag}_k(x) = \Big( x_1, \ldots, x_{J-1}, \sum_{i=J}^{J+k} x_i, x_{J+k+1}, \ldots, x_{n+k+1} \Big).$$
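The two transformations defined above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `frag_k` and `coag_k` are hypothetical, indices are 0-based rather than 1-based, and the $\mathrm{Dir}_k(1/k)$ variable is drawn by normalising i.i.d. Gamma$(1/k, 1)$ variables as in Section 2.1.

```python
import random

def frag_k(x, k, rng=random):
    """Split a size-biased mass of x into k + 1 pieces with Dir_k(1/k) proportions.

    Returns (new sequence, 0-based index of the mass that was split).
    """
    i = rng.choices(range(len(x)), weights=x)[0]  # P(I = i) = x_i
    gammas = [rng.gammavariate(1.0 / k, 1.0) for _ in range(k + 1)]
    total = sum(gammas)
    pieces = [x[i] * g / total for g in gammas]
    return x[:i] + pieces + x[i + 1:], i

def coag_k(x, k, rng=random):
    """Merge k + 1 consecutive masses of x starting at a uniform index.

    Returns (new sequence, 0-based index J of the first merged mass).
    """
    j = rng.randrange(len(x) - k)  # uniform on the n + 1 admissible positions
    return x[:j] + [sum(x[j:j + k + 1])] + x[j + k + 1:], j

if __name__ == "__main__":
    y, i = frag_k([0.2, 0.3, 0.5], k=2)
    print(y, i)
    print(coag_k(y, k=2))
```

Merging the block `y[i:i + k + 1]` recovers the original sequence exactly, which is the "obvious coagulation" that undoes a fragmentation once the split index is known.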

Remark. Consider the following alternative random coagulation of $x = (x_1, \ldots, x_{n+k+1}) \in
\Delta_{n+k}$. Pick $k+1$ indices $i_1, \ldots, i_{k+1}$ from $\{1, \ldots, n+k+1\}$ uniformly at random without
replacement, merge the masses $x_{i_1}, \ldots, x_{i_{k+1}}$, leave the other masses unchanged and let
$\widetilde{\mathrm{Coag}}_k(x)$ be the sequence obtained by ranking the resulting masses in decreasing order.
Write also $\mathrm{Coag}_k^{\downarrow}(x)$ for the sequence $\mathrm{Coag}_k(x)$ re-arranged in decreasing order. Then if
$\xi$ is exchangeable the pairs $(\xi, \widetilde{\mathrm{Coag}}_k(\xi))$ and $(\xi, \mathrm{Coag}_k^{\downarrow}(\xi))$ have the same distribution.
This remark applies in particular to the case when $\xi$ has the law $\mathrm{Dir}_{n+k}(1/k)$, and can
thus be combined with forthcoming Proposition 1.

The starting point of this work lies in the observation of a simple relation of duality 
which links these two random transformations via Dirichlet laws. 

Proposition 1. Let $k, n \ge 1$ be two integers, and $\xi, \xi'$ two random variables with values
in $\Delta_n$ and $\Delta_{n+k}$, respectively. The following assertions are then equivalent:

(i) $\xi$ has the law $\mathrm{Dir}_n(1/k)$ and, conditionally on $\xi$, $\xi'$ is distributed as $\mathrm{Frag}_k(\xi)$.

(ii) $\xi'$ has the law $\mathrm{Dir}_{n+k}(1/k)$ and, conditionally on $\xi'$, $\xi$ is distributed as $\mathrm{Coag}_k(\xi')$.

It has been observed by Kingman [13] that for $k = 1$, if $\xi'$ is uniformly distributed on
the simplex $\Delta_{n+1}$ (i.e. has the law $\mathrm{Dir}_{n+1}(1)$), then $\mathrm{Coag}_1(\xi')$ is uniformly distributed on
$\Delta_n$. Clearly, this agrees with our statement.

Proof: Let $\gamma_1, \gamma_2, \ldots, \gamma_{n+1}$ be independent Gamma$(1/k, 1)$ random variables and set

$$\gamma = \sum_{j=1}^{n+1} \gamma_j \quad\text{and}\quad \xi = \Big( \frac{\gamma_1}{\gamma}, \ldots, \frac{\gamma_{n+1}}{\gamma} \Big),$$

so that $\xi$ has law $\mathrm{Dir}_n(1/k)$ and is independent of $\gamma$. Suppose that $\eta$ is a $\mathrm{Dir}_k(1/k)$
random variable which is independent of the $\gamma_j$'s, and let $\Phi : \Delta_{n+k} \to \mathbb{R}$ be a bounded
measurable function. Let $I$ be an index picked at random from $\{1, \ldots, n+1\}$ according
to the conditional distribution

$$P(I = i \mid \gamma_1, \ldots, \gamma_{n+1}) = \frac{\gamma_i}{\gamma}, \qquad i = 1, \ldots, n+1,$$

and denote by $\mathrm{Frag}_k(\xi)$ the random sequence obtained from $\xi$ after the fragmentation of
its $I$th mass according to $\eta$. We have



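$$E\big( \Phi(\mathrm{Frag}_k(\xi)),\ I = i \big) = E\Big[ \frac{\gamma_i}{\gamma}\, \Phi\Big( \Big(\frac{\gamma_l}{\gamma}\Big)_{l<i},\ \frac{\gamma_i}{\gamma}\,\eta,\ \Big(\frac{\gamma_l}{\gamma}\Big)_{l>i} \Big) \Big].$$

Now, using the independence of $\gamma$ and $\xi$ and the fact that $\gamma$ has the law Gamma$((n+1)/k, 1)$,
we see that the last expression is equal to

$$\frac{k}{n+1}\, E\Big[ \gamma_i\, \Phi\Big( \Big(\frac{\gamma_l}{\gamma}\Big)_{l<i},\ \frac{\gamma_i}{\gamma}\,\eta,\ \Big(\frac{\gamma_l}{\gamma}\Big)_{l>i} \Big) \Big]$$

$$= \frac{1}{n+1}\, E\Big[ \int_0^\infty \Phi\Big( \Big(\frac{\gamma_l}{x + \sum_{j \ne i} \gamma_j}\Big)_{l<i},\ \frac{x\,\eta}{x + \sum_{j \ne i} \gamma_j},\ \Big(\frac{\gamma_l}{x + \sum_{j \ne i} \gamma_j}\Big)_{l>i} \Big) \frac{k\, x^{1/k} e^{-x}}{\Gamma(1/k)}\, dx \Big]$$

$$= \frac{1}{n+1}\, E\Big[ \Phi\Big( \Big(\frac{\gamma_l}{\gamma' + \sum_{j \ne i} \gamma_j}\Big)_{l<i},\ \frac{\gamma'\,\eta}{\gamma' + \sum_{j \ne i} \gamma_j},\ \Big(\frac{\gamma_l}{\gamma' + \sum_{j \ne i} \gamma_j}\Big)_{l>i} \Big) \Big],$$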



where $\gamma' \sim$ Gamma$((k+1)/k, 1)$, independently of $\eta$ and $(\gamma_j)_{j \ne i}$. But then $\gamma' \eta$ is a
collection of $k+1$ independent Gamma$(1/k, 1)$ random variables, so $\mathrm{Frag}_k(\xi)$ has the law
$\mathrm{Dir}_{n+k}(1/k)$ and is independent of the random index $I$ which is uniformly distributed on
$\{1, \ldots, n+1\}$. Since we can recover $\xi$ from $\mathrm{Frag}_k(\xi)$ and $I$ by an obvious coagulation,
this completes the proof. □

Next we turn our attention to the infinite ranked simplex and define two random
transformations, $\mathrm{Frag}_\infty : \Delta_\infty \to \Delta_\infty$ and $\mathrm{Coag}_a : \Delta_\infty \to \Delta_\infty$, where $a \in [0, 1]$ is
some parameter. The fragmentation transformation on the infinite simplex simply mimics
that on the finite simplex; in this direction, recall that the Poisson-Dirichlet PD(1)
arises as the weak limit as $k \to \infty$ of a sequence of $\mathrm{Dir}_k(1/k)$ variables after obvious
re-ordering. More precisely, given $x = (x_1, \ldots) \in \Delta_\infty$, we pick a mass $x_I$ at random
by size-biased sampling and split $x_I$ using an independent variable $\eta = (\eta_1, \ldots)$ with
law PD(1). In other words, $\mathrm{Frag}_\infty(x)$ is the ranked sequence with unordered terms
$x_1, \ldots, x_{I-1}, x_I \eta_1, x_I \eta_2, \ldots, x_{I+1}, \ldots$.

Next, consider a sequence $U_1, U_2, \ldots$ of i.i.d. uniform random variables and $a \in [0, 1]$.
Starting again from some fixed $x \in \Delta_\infty$, we merge the masses $x_i$ for which $U_i \le a$ into a
single mass and leave the others unchanged. We denote by $\mathrm{Coag}_a(x)$ the random sequence
obtained by putting the resulting masses in decreasing order. We then have the following
analogue of Proposition 1 which is reminiscent of Corollary 13 of Pitman [17].
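The coagulation with independent uniform marks is easy to sketch on a truncated sequence. In the snippet below a finite list stands in for a point of $\Delta_\infty$, which is an approximation, and the name `coag_a` is illustrative.

```python
import random

def coag_a(x, a, rng=random):
    """Merge the masses x[i] whose uniform mark U_i is <= a, then rank the result."""
    merged = 0.0
    kept = []
    for mass in x:
        if rng.random() <= a:  # U_i <= a: this mass joins the merged block
            merged += mass
        else:
            kept.append(mass)
    if merged > 0.0:
        kept.append(merged)
    return sorted(kept, reverse=True)

if __name__ == "__main__":
    print(coag_a([0.5, 0.3, 0.2], a=0.5))
```

With `a = 1` every mass is merged and the result is the single mass 1, while small values of `a` leave most masses untouched.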

Proposition 2. Let $\xi, \xi'$ be two random variables with values in $\Delta_\infty$. For every $\theta > 0$,
the following assertions are equivalent:

(i) $\xi$ has the law PD$(\theta)$ and, conditionally on $\xi$, $\xi'$ is distributed as $\mathrm{Frag}_\infty(\xi)$.

(ii) $\xi'$ has the law PD$(\theta + 1)$ and, conditionally on $\xi'$, $\xi$ is distributed as $\mathrm{Coag}_{1/(\theta+1)}(\xi')$.

Proof: Let $\gamma = (\gamma(t), t \ge 0)$ be a standard gamma process and set

$$D_t = \gamma((\theta + 1)t) / \gamma(\theta + 1),$$

for $0 \le t \le 1$, so that $(D_t, 0 \le t \le 1)$ is a Dirichlet process of parameter $\theta + 1$. (The
vector of ordered jumps of this Dirichlet process has the PD$(\theta + 1)$ distribution.) Consider
the following alternative way of thinking of the random coagulation operator $\mathrm{Coag}_{1/(\theta+1)}$:
pick a point $V$ uniformly in $[0, 1]$ and define a new process $(D'_t, 0 \le t \le 1)$ by

$$D'_t = \begin{cases} D_{\theta t/(\theta+1)} & \text{if } t < V, \\ D_{(1+\theta t)/(\theta+1)} & \text{if } t \ge V. \end{cases}$$

As the times of the jumps of $D$ are uniformly distributed on $[0, 1]$, this picks a proportion
$1/(\theta + 1)$ of them and coalesces them into a single jump (say $\beta^* = D_{(1+\theta V)/(\theta+1)} -
D_{\theta V/(\theta+1)}$) at $V$. Let $\beta_1 \ge \beta_2 \ge \cdots \ge 0$ be the sequence of other jumps of $D'$ and
$U_1, U_2, \ldots$ the corresponding jump times. Let $\beta'_1 \ge \beta'_2 \ge \cdots \ge 0$ be the sequence of jumps
of $D$ in the interval $[\theta V/(\theta + 1), (1 + \theta V)/(\theta + 1)]$, so that $\beta^* = \sum_{i=1}^{\infty} \beta'_i$. We wish to
show that $D'$ is a Dirichlet process with parameter $\theta$, so that the vector $(\beta^*, \beta_1, \beta_2, \ldots)$ of
its jumps (re-arranged in decreasing order) has the PD$(\theta)$ distribution. We will also
show that the mass $\beta^*$ resulting from the coalescence constitutes a size-biased pick from
this vector.



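Let

$$\gamma^1(t) = \begin{cases} \gamma(t) & \text{if } t < V\theta, \\ \gamma(t + 1) - (\gamma(V\theta + 1) - \gamma(V\theta)) & \text{if } V\theta \le t \le \theta, \end{cases}$$

$$\gamma^2(t) = \gamma(V\theta + t) - \gamma(V\theta) \quad \text{for } 0 \le t \le 1.$$

Then $\gamma^1$ and $\gamma^2$ are independent processes with $\gamma^1 \overset{d}{=} (\gamma(t), 0 \le t \le \theta)$ and $\gamma^2 \overset{d}{=} (\gamma(t), 0 \le
t \le 1)$, independently of $V$. Write $\delta_1 \ge \delta_2 \ge \cdots$ for the ordered sequence of jumps of $\gamma^1$
and $T_1, T_2, \ldots$ for the corresponding times of these jumps. Write $\delta'_1 \ge \delta'_2 \ge \cdots$ for the
ordered sequence of jumps of $\gamma^2$. Then

(i) $U_1 = T_1/\theta$, $U_2 = T_2/\theta$, ... are i.i.d. U[0, 1],

(ii) $\beta^* = \gamma^2(1)/\gamma(1 + \theta)$ and so has a Beta$(1, \theta)$ distribution, being of the form $A/(A + B)$ with $A \sim$ Gamma$(1, 1)$ and $B \sim$ Gamma$(\theta, 1)$ independent,

(iii) $\frac{1}{\beta^*}(\beta'_1, \beta'_2, \ldots) = \frac{1}{\gamma^2(1)}(\delta'_1, \delta'_2, \ldots)$ and so has the PD(1) distribution,

(iv) $\frac{1}{1 - \beta^*}(\beta_1, \beta_2, \ldots) = \frac{1}{\gamma^1(\theta)}(\delta_1, \delta_2, \ldots)$ and so has the PD$(\theta)$ distribution.

Furthermore, the random variables in (i) to (iv) above are independent. The fact that $\beta^*$
is a size-biased pick from $(\beta^*, \beta_1, \beta_2, \ldots)$ and the PD$(\theta)$ distribution of the latter follow
from (i) and (iii) and the stick-breaking scheme (see, for instance, Definition 1 in Pitman
and Yor [19]). That $D'$ is a Dirichlet process of parameter $\theta$ then follows from (iv) and
the independence.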

The coagulation operator used here can be re-phrased as follows: starting with $x \in
\Delta_\infty$, take a sequence $V, V_1, V_2, \ldots$ of i.i.d. U[0, 1] random variables, merge the masses $x_i$
for which $V_i \in [\theta V/(\theta + 1), (1 + \theta V)/(\theta + 1)]$, leave the other masses unchanged and, finally,
rank the resulting sequence in decreasing order. Call this operator $\widetilde{\mathrm{Coag}}_{1/(\theta+1)}$. Then it
is clear that whenever $\xi'$ is a random exchangeable sequence in $\Delta_\infty$, $(\xi', \widetilde{\mathrm{Coag}}_{1/(\theta+1)}(\xi'))$
and $(\xi', \mathrm{Coag}_{1/(\theta+1)}(\xi'))$ have the same distribution. Our claim follows now readily from
these results. □

Remark. It may be interesting to check Proposition 2 as follows. Consider a Poisson
random measure $M$ on $(0, \infty)$ with intensity $\theta x^{-1} e^{-x}\,dx$. Let $a_1, a_2, \ldots$ be the atoms of
$M$ in decreasing order, so that

$$\Big( \frac{a_1}{\sum_{j=1}^{\infty} a_j}, \frac{a_2}{\sum_{j=1}^{\infty} a_j}, \ldots \Big)$$

has distribution PD$(\theta)$, independently of $\sum_{j=1}^{\infty} a_j$. Let $\eta \sim$ PD(1), independently of $M$,
and suppose that $\Phi : \Delta_\infty \to \mathbb{R}$ is any symmetric bounded measurable function. Then if
$\xi \sim$ PD$(\theta)$, using independence we have that



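$$E[\Phi(\mathrm{Frag}_\infty(\xi))] = \frac{1}{\theta}\, E\Big[ \sum_{i=1}^{\infty} a_i\, \Phi\Big( \Big(\frac{a_l}{\sum_{j=1}^{\infty} a_j}\Big)_{l \ne i},\ \frac{a_i\,\eta}{\sum_{j=1}^{\infty} a_j} \Big) \Big].$$

By the Palm formula, this is equal to

$$\frac{1}{\theta}\, E\Big[ \int_0^\infty x\, \Phi\Big( \Big(\frac{a_l}{x + \sum_{j=1}^{\infty} a_j}\Big)_{l \ge 1},\ \frac{x\,\eta}{x + \sum_{j=1}^{\infty} a_j} \Big)\, \theta x^{-1} e^{-x}\, dx \Big] = E\Big[ \Phi\Big( \Big(\frac{a_l}{a' + \sum_{j=1}^{\infty} a_j}\Big)_{l \ge 1},\ \frac{a'\,\eta}{a' + \sum_{j=1}^{\infty} a_j} \Big) \Big],$$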



where $a' \sim$ Exp(1), independently of $M$ and $\eta$. But then $a'\eta$ has the distribution of
the atoms of a Poisson random measure with intensity $x^{-1} e^{-x}\,dx$ arranged in decreasing
order and so we see that taking these atoms together with those of $M$, we obtain a Poisson
random measure of intensity $(\theta + 1) x^{-1} e^{-x}\,dx$. Hence, $\mathrm{Frag}_\infty(\xi)$ has the law PD$(\theta + 1)$.



2.3 Dual fragmentation and coagulation chains 

The dual fragmentation and coagulation operators that were defined in the preceding 
section incite us to introduce Markov fragmentation and coagulation chains in duality by 
time-reversal. Specifically, we consider for each integer $k \ge 1$ a chain

$$X^{(k)}(0), X^{(k)}(1), X^{(k)}(2), \ldots,$$

where $X^{(k)}(n)$ is a random variable with values in $\Delta_{nk}$ (in particular $X^{(k)}(0) = 1$), and
the conditional distribution of $X^{(k)}(n + 1)$ given $X^{(k)}(n) = x$ is the law of $\mathrm{Frag}_k(x)$. We
deduce from Proposition 1 by induction that for each $n$, $X^{(k)}(n)$ has the distribution
$\mathrm{Dir}_{nk}(1/k)$. The time-reversed coagulation chain
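The forward chain is straightforward to simulate, which gives a quick sanity check that the state after $n$ steps has $nk + 1$ masses. The sketch below is illustrative only: `frag_step` and `fragmentation_chain` are hypothetical names, and the Dirichlet split is again drawn by normalising i.i.d. Gamma$(1/k, 1)$ variables.

```python
import random

def frag_step(x, k, rng):
    """One step of the chain: size-biased pick, then a Dir_k(1/k) split."""
    i = rng.choices(range(len(x)), weights=x)[0]
    gammas = [rng.gammavariate(1.0 / k, 1.0) for _ in range(k + 1)]
    total = sum(gammas)
    return x[:i] + [x[i] * g / total for g in gammas] + x[i + 1:]

def fragmentation_chain(k, n_steps, rng=None):
    """Return the states X(0), ..., X(n_steps), started from the single mass 1."""
    rng = rng or random.Random()
    states = [[1.0]]
    for _ in range(n_steps):
        states.append(frag_step(states[-1], k, rng))
    return states

if __name__ == "__main__":
    for n, state in enumerate(fragmentation_chain(k=2, n_steps=3)):
        print(n, len(state), state)
```

Reading the returned list backwards gives one sample path of the time-reversed coagulation chain.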

$$\ldots, X^{(k)}(n + 1), X^{(k)}(n), \ldots, X^{(k)}(1), X^{(k)}(0)$$

is also Markov; more precisely, the conditional distribution of $X^{(k)}(n)$ given $X^{(k)}(n + 1) =
x$ is the law of $\mathrm{Coag}_k(x)$. Note that for $k = 1$, this has the distribution of the jump chain
of Kingman's coalescent.

Analogously, for $k = \infty$, we can define a Markov fragmentation chain on $\Delta_\infty$,

$$X^{(\infty)}(0), X^{(\infty)}(1), X^{(\infty)}(2), \ldots,$$

such that the conditional distribution of $X^{(\infty)}(n + 1)$ given $X^{(\infty)}(n) = x$ is the law
of $\mathrm{Frag}_\infty(x)$. We deduce by induction from Proposition 2 that for every $\theta > 0$, if the
distribution of the initial state $X^{(\infty)}(0)$ is PD$(\theta)$ then, for every integer $n$, $X^{(\infty)}(n)$ has
the distribution PD$(\theta + n)$. Moreover, in this case, the time-reversed coagulation chain

$$\ldots, X^{(\infty)}(n + 1), X^{(\infty)}(n), \ldots, X^{(\infty)}(0)$$

is also Markov; more precisely, the conditional distribution of $X^{(\infty)}(n)$ given $X^{(\infty)}(n + 1) =
x$ is the law of $\mathrm{Coag}_{1/(n+1+\theta)}(x)$.

Remarks. (a) Recall that the parameter $\theta$ can be recovered from a sample $\xi = (\xi_1, \xi_2, \ldots)$ of a PD$(\theta)$
random variable as follows:

$$\theta = \lim_{\varepsilon \to 0+} \frac{1}{\log(1/\varepsilon)} \max\{ n : \xi_n > \varepsilon \}.$$

This shows that the above description for the reversed coagulation chain is indeed Markovian.

(b) There is a simple representation for the $k = \infty$ fragmentation chain in terms of
compound bridges with exchangeable increments which is inspired by [?]. Let $U_0, U_1, \ldots$
be a sequence of independent uniform variables on $[0, 1]$. For each $n$, we consider the
elementary bridge $b_n : [0, 1] \to [0, 1]$ defined by

Tl 1 

Then it is easy to check that for every $n \in \mathbb{N}$, the sequence $b_n \circ b_{n+1} \circ \cdots \circ b_{n+i}$ converges
pointwise almost surely as $i \to \infty$ to a bridge with exchangeable increments $B_n$ which has
no drift and infinitely many jumps a.s. If we write $\beta_n \in \Delta_\infty$ for the sequence of the sizes
of the jumps of $B_n$ ranked in decreasing order, then the chain $(\beta_n, n \in \mathbb{N})$ has the same
law as $X^{(\infty)}$. We refer to [?] for the necessary technical background.

3 The genealogy of Yule processes 

We shall now show that the dual fragmentation and coagulation chains which we introduced
in the preceding section are naturally connected to the genealogy of Yule processes.

3.1 Discrete setting 

For every integer $k \ge 1$, we write $Y^{(k)} = (Y^{(k)}_t, t \ge 0)$ for the Yule process started from
$Y^{(k)}_0 = 1$: $Y^{(k)}_t$ gives the number of individuals alive at time $t$ in a branching process in
which each individual lives for an exponential time of parameter 1 and gives birth at its
death to $k + 1$ children, which then evolve independently of one another according to the
same rules as their parent. We agree to label each child of an individual by an integer
in $\{1, \ldots, k + 1\}$, which allows us to order individuals at any generation in a consistent
way: given two distinct individuals, we may consider their most recent common ancestor.
Plainly, two different children of this ancestor are each the ancestor of exactly one of these two
individuals, and the labelling of the children of the most recent common ancestor induces
the order of the individuals.
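The population size of this Yule process is simple to simulate, since with $m$ individuals alive the next death occurs after an exponential time of parameter $m$ and replaces one individual by $k + 1$. The following sketch tracks only the population count, not the labelling of the children, and the function name is illustrative.

```python
import math
import random

def yule_population(k, t_max, rng=random):
    """Number of individuals alive at time t_max, starting from one individual."""
    t, population = 0.0, 1
    while True:
        # Each individual dies at rate 1, so the next death comes after Exp(population).
        t += rng.expovariate(population)
        if t > t_max:
            return population
        population += k  # one death and k + 1 births: a net gain of k

if __name__ == "__main__":
    k, t_max = 2, 1.0
    samples = [yule_population(k, t_max) for _ in range(2000)]
    # The rescaled mean should be close to 1 if exp(-k t) Y_t is a martingale.
    print(math.exp(-k * t_max) * sum(samples) / len(samples))
```

The population always lies in $\{1, 1 + k, 1 + 2k, \ldots\}$, which matches the state spaces $\Delta_{nk}$ of the chain $X^{(k)}$.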

Lemma 3. The process $(\exp(-kt) Y^{(k)}_t, t \ge 0)$ is a uniformly integrable martingale and
its limit $W^{(k)}$ has the Gamma$(1/k, 1/k)$ distribution.
Proof: A similar limit result is stated in Athreya & Ney [?] on page 130; however, the
limiting distribution given there is incorrect and so we shall provide here a detailed proof.
The martingale property is classical, so we focus on the distribution of the limit $W^{(k)}$.
Define

$$F_t(s) := E\big( s^{Y^{(k)}_t} \big).$$

The backward equation implies that

$$\frac{\partial}{\partial t} F_t(s) = F_t(s)^{k+1} - F_t(s), \qquad F_0(s) = s.$$

This equation has solution

$$F_t(s) = s e^{-t} \big( 1 - (1 - e^{-kt}) s^k \big)^{-1/k}.$$

Hence, for $\theta < 0$,

$$E\big( \exp( \theta e^{-kt} Y^{(k)}_t ) \big) = \exp( \theta e^{-kt} )\, e^{-t} \big( 1 - (1 - e^{-kt}) \exp( \theta k e^{-kt} ) \big)^{-1/k} = \big( e^{kt} \exp( -\theta k e^{-kt} ) - e^{kt} + 1 \big)^{-1/k}$$

and when $t \to \infty$, this quantity converges to

$$(1 - k\theta)^{-1/k} = \Big( \frac{1/k}{1/k - \theta} \Big)^{1/k},$$

which is the moment generating function of a gamma random variable with parameters
$(1/k, 1/k)$. □
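The closed form for the generating function used in this proof can be checked by a short computation. The sketch below assumes the standard backward equation $\partial_t F_t = F_t^{k+1} - F_t$ with $F_0(s) = s$ for a branching process with rate-1 lifetimes and $k + 1$ offspring. The auxiliary function $G_t$ is introduced only for this check.

```latex
% Check that F_t(s) = s e^{-t} (1 - (1 - e^{-kt}) s^k)^{-1/k}
% solves \partial_t F_t = F_t^{k+1} - F_t with F_0(s) = s.
\begin{align*}
G_t(s) &:= F_t(s)^{-k}, \qquad G_0(s) = s^{-k}, \\
\partial_t G_t(s) &= -k\, F_t(s)^{-k-1} \bigl( F_t(s)^{k+1} - F_t(s) \bigr)
                   = k \bigl( G_t(s) - 1 \bigr), \\
G_t(s) &= 1 + (s^{-k} - 1) e^{kt}
        = s^{-k} e^{kt} \bigl( 1 - (1 - e^{-kt}) s^{k} \bigr), \\
F_t(s) &= G_t(s)^{-1/k}
        = s e^{-t} \bigl( 1 - (1 - e^{-kt}) s^{k} \bigr)^{-1/k}.
\end{align*}
```

Substituting $s = \exp(\theta e^{-kt})$ in the third line gives $G_t = e^{kt}\exp(-\theta k e^{-kt}) - e^{kt} + 1$, which tends to $1 - k\theta$ as $t \to \infty$.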

We think of $W^{(k)}$ as the size of the terminal population. For every $t \ge 0$, by application
of the branching property at time $t$, we may decompose the terminal population into
sub-populations having the same ancestor at time $t$. Specifically,

$$W^{(k)} = \sum_{i=1}^{Y^{(k)}_t} W^{(k)}_i(t),$$

where $W^{(k)}_i(t)$ is the size of the terminal sub-population descending from the $i$th individual
in the population at time $t$. Observe that conditionally on $Y^{(k)}_t$, the variables $W^{(k)}_i(t)$ are
independent and all have the same law as $e^{-kt} W^{(k)}$.
Finally, we define the genealogical process $G^{(k)} = (G^{(k)}(t), t \ge 0)$ associated to $Y^{(k)}$
by

$$G^{(k)}(t) = \Big( W^{(k)}_1(t), \ldots, W^{(k)}_{Y^{(k)}_t}(t) \Big).$$

The genealogical structure of the Yule process can be described in terms of the
fragmentation chain $X^{(k)}$ of Section 2.3 as follows.

Theorem 4. Let $N = (N_t, t \ge 0)$ be a standard Poisson process which is independent of
the chain $X^{(k)}$. Then for each $w > 0$, the compound chain

$$\big( w X^{(k)}(N_{wt}),\ t \ge 0 \big)$$

has the same law as the time-changed process

$$\Big( G^{(k)}\Big( \frac{1}{k} \log(1 + kt) \Big),\ t \ge 0 \Big)$$

conditioned on $W^{(k)} = w$.
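The deterministic time change $t \mapsto \frac{1}{k}\log(1 + kt)$ appearing in the theorem, and its inverse $s \mapsto (e^{ks} - 1)/k$, can be written down and checked numerically. The function names below are illustrative.

```python
import math

def tau(t, k):
    """Time change t -> (1/k) log(1 + k t)."""
    return math.log1p(k * t) / k

def tau_inverse(s, k):
    """Inverse time change s -> (exp(k s) - 1) / k."""
    return math.expm1(k * s) / k

if __name__ == "__main__":
    for k in (1, 2, 5):
        for t in (0.0, 0.3, 4.0):
            print(k, t, tau(t, k), tau_inverse(tau(t, k), k))
```

Using `log1p` and `expm1` keeps the round trip accurate for small values of `k * t`.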

Remark. Theorem I of Kendall [?] states that given $W^{(1)}$, $\big( Y^{(1)}_{\log(1 + t/W^{(1)})} - 1,\ t \ge 0 \big)$ is a
Poisson process with unit parameter. This is clearly an aspect of Theorem 4. Moreover,
on page 130 of Athreya & Ney [?], it is suggested that no generalization of Kendall's
result to a more general continuous-time Markov branching process is known; Theorem 4
constitutes a small such generalization.

Proof: Set $\tau(t) := \frac{1}{k} \log(1 + kt)$ and let $T$ be the time of the first birth in the Yule process,
which is also the time of the first dislocation of $G^{(k)}$. The $k + 1$ fragments of $G^{(k)}(T)$
can be written as $e^{-kT} Z_1, \ldots, e^{-kT} Z_{k+1}$ where, by the branching property, $Z_1, \ldots, Z_{k+1}$
are i.i.d. Gamma$(1/k, 1/k)$ random variables, independent of $T$ which is Exp(1). Define
a change of variables by

$$S = \tau^{-1}(T) = (e^{kT} - 1)/k,$$

$$U_1 = e^{-kT} Z_1, \ \ldots, \ U_k = e^{-kT} Z_k, \quad W = e^{-kT}(Z_1 + \cdots + Z_{k+1}).$$

It is straightforward to see that the joint density of $(T, Z_1, \ldots, Z_{k+1})$ is

$$f(t, z_1, \ldots, z_{k+1}) = e^{-t}\, \Gamma(1/k)^{-(k+1)} (1/k)^{(k+1)/k} (z_1 z_2 \cdots z_{k+1})^{-(k-1)/k} \exp(-(z_1 + \cdots + z_{k+1})/k)$$

and so the joint density of $(S, U_1, \ldots, U_k, W)$ is

$$g(s, u_1, \ldots, u_k, w) = w e^{-ws} \cdot (1/k)\, \Gamma(1/k)^{-k}\, w^{-1/k} \big( u_1 u_2 \cdots u_k (w - u_1 - \cdots - u_k) \big)^{-(k-1)/k} \cdot (1/k)^{1/k}\, \Gamma(1/k)^{-1}\, w^{-(k-1)/k} \exp(-w/k).$$

Hence, $W \sim$ Gamma$(1/k, 1/k)$ (as we already knew) and, conditional on $W = w$, we
have $S \sim$ Exp$(w)$ and $(U_1, U_2, \ldots, U_k, W - U_1 - \cdots - U_k) \sim w\, \mathrm{Dir}_k(1/k)$ independently of
$S$. Thus, the first dislocation has the correct dynamics. But by the branching property,
subsequent dislocations are independent for different sub-populations and the total rate
of fragmentation is always $w$. Hence the result. □

In the terminology of [?], Theorem 4 states that the time-changed genealogical process
$G^{(k)} \circ \tau$ is a self-similar fragmentation with index 1, dislocation law $\mathrm{Dir}_k(1/k)$ and erosion
coefficient 0. It may be interesting to observe that in the special case $k = 1$, this result
can also be derived as follows.

Consider a real Brownian motion $B$ started from 1 and killed when it reaches 0 (at time
$T_0 = \inf\{t \ge 0 : B_t = 0\}$). For every $u \in [0, 1[$, let $Y_u$ denote the number of excursions of
$B$ away from 1 which go below level $u$. Then $(Y^{(1)}_{-\log(1-u)})_{0 \le u < 1}$ is a version of $(Y_u)_{0 \le u < 1}$.
To see this, let us consider the evolution of $Y$. Firstly, $Y_0 = 1$, corresponding to the
single excursion below 1 which reaches 0. Let $D = \sup\{t < T_0 : B_t = 1\}$, the starting
time of the final excursion which hits 0, let $U = \inf_{0 \le t \le D} B_t$ be the level reached by the
deepest excursion below 1 before $D$ and let $T_U$ be the time at which it is reached. Then,
by Williams' path decomposition theorem (Theorem VII.4.9 of Revuz and Yor [20]), $U$ is
distributed uniformly on $[0, 1[$ and, conditional on $U$, $(B_t)_{0 \le t \le T_U}$ is a Brownian motion
started at 1 and stopped when it first hits level $U$. By symmetry, $(B_{D-t})_{0 \le t \le D - T_U}$ is
another independent Brownian motion started at 1 and stopped when it first hits level
$U$. Thus, $Y_u$ is equal to 1 on $[0, U[$, $Y_U = 2$ and $(Y_{U+v})_{0 \le v < 1-U}$ evolves as the sum
of two independent processes which are the same as $Y$ except that the times until the
first jumps are now uniform on $[0, 1-U[$ rather than on $[0, 1[$. (This is Theorem 8 of Le
Gall [?] repeated here for completeness.) Time-changing $Y^{(1)}$ with $u \to -\log(1 - u)$
means that its exponential inter-jump times become uniform and so we do, indeed, have

$$(Y_u)_{0 \le u < 1} \overset{d}{=} \big( Y^{(1)}_{-\log(1-u)} \big)_{0 \le u < 1}.$$
A more elegant way of expressing the preceding argument is to say that the Brownian 
path encodes a continuous-state branching process with quadratic branching mechanism. 
The local time at level 1, $L^1_{T_0}$, satisfies

$$L^1_{T_0} = \lim_{u \to 1-} 2(1 - u) Y_u.$$

In this context, $\frac{1}{2} L^1_{T_0}$ corresponds to the size of the population at time 1 in the continuous-state
branching process generated by a single ancestor conditioned to have descendants
up to time 1. The so-called reduced tree associated with the population at time 1 is
described up to the deterministic time-change $u \to -\log(1 - u)$ by the Yule process $Y^{(1)}$.
See, for instance, Section 2.7 in Duquesne and Le Gall [?], and Fleischmann and Siegmund-Schultze [?].
Note that the well-known fact that $\frac{1}{2} L^1_{T_0}$ has an exponential distribution with
mean 1 (Proposition VI.4.6 of Revuz and Yor [20]) gives another derivation of the limiting
distribution in Lemma 3 since

$$W^{(1)} = \lim_{t \to \infty} e^{-t} Y^{(1)}_t = \lim_{u \to 1-} (1 - u) Y^{(1)}_{-\log(1-u)} = \frac{1}{2} L^1_{T_0}.$$


It is known from excursion theory that in the scale of the local time at level 1, the
rate of excursions of $B$ away from 1 which reach level $u \in\, ]0, 1[$ but do not exceed $u - du$
is $(1 - u)^{-2}\,du$. Note that the map $s \to 1 - \frac{1}{1+s}$ from $\mathbb{R}_+$ to $[0, 1[$ has inverse $u \to \frac{1}{1-u} - 1$
and, thus, transforms Lebesgue measure on $\mathbb{R}_+$ into the measure $(1 - u)^{-2}\,du$ on $[0, 1[$.
Suppose that we split the local time at level 1 according to the occurrence of excursions
exceeding level $u$, so that we obtain the sequence

$$W(u) = \big( W_1(u), \ldots, W_{Y_u}(u) \big),$$

where $W(u)$ is the sequence of the increments of the local time at level 1 on the maximal
time intervals such that at the beginning and end of each interval $B$ is at 1 and during
the interval remains above level $u$. Then it follows easily that the time-changed process
$\big( W\big(1 - \frac{1}{1+s}\big),\ s \ge 0 \big)$ is a fragmentation in which each mass, say $x$, splits at rate $x$
into $xU$ and $x(1 - U)$ where $U$ is uniform. In other words, conditionally on $\frac{1}{2} L^1_{T_0} = w$,
the process $\big( W\big(1 - \frac{1}{1+s}\big),\ s \ge 0 \big)$ is distributed as the compound fragmentation chain
$\big( w X^{(1)}(N_{ws}),\ s \ge 0 \big)$, where $N$ is an independent standard Poisson process.

Finally, the composition of the two time-changes which appear in this analysis yields

$$-\log\Big( 1 - \Big( 1 - \frac{1}{1+s} \Big) \Big) = \log(1 + s), \qquad s \in \mathbb{R}_+,$$

and so we recover Theorem 4 in the special case $k = 1$. Unfortunately, it does not seem
that there are similar interpretations for $k \ge 2$.

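Corollary 5. We have that

$$\Big( \frac{1}{W^{(k)}}\, G^{(k)}\Big( \frac{1}{k} \log\big( 1 + k e^{-t} / W^{(k)} \big) \Big),\ t \in \mathbb{R} \Big)$$

is a time-homogeneous Markov coalescent process which is independent of $W^{(k)}$. For any
$n \ge 1$, given that it is in state $x \in \Delta_{nk}$, it waits an exponential time of parameter $n$ and
then jumps to a variable distributed as $\mathrm{Coag}_k(x)$, independently of the exponential time.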

Note that the case $k = 1$ of this result gives a variation of Kingman's coalescent. The
jump-chains are identical, as we have already noted, but here the rate of coalescence of two 
blocks depends on the total number of blocks present, whereas in Kingman's coalescent 
it does not. 

Proof: Firstly, we note that by Theorem 4,

$$\Big( \frac{1}{W^{(k)}}\, G^{(k)}\Big( \frac{1}{k} \log\big( 1 + k e^{-t} / W^{(k)} \big) \Big),\ t \in \mathbb{R} \Big)$$

has the same law as

$$\big( X^{(k)}(N_{e^{-t}}),\ t \in \mathbb{R} \big)$$

and so we will work with the latter process instead. The $k = 1$ case is essentially treated
in [?] and the proof proceeds in the same way here. The jump chain clearly behaves in the
correct manner and so it remains to check that the inter-jump times are as claimed. Let
$0 < T_1 < T_2 < \cdots$ be the jump times of $(N_t)_{t \ge 0}$. Then the first instant that $X^{(k)}(N_{e^{-t}})$
has exactly $nk + 1$ terms is

$$\inf\{ t \in \mathbb{R} : N_{e^{-t}} = n \} = -\log T_{n+1}.$$

The sequence of inter-jump times is

$$\ldots, \ \log T_{n+1} - \log T_n, \ \log T_n - \log T_{n-1}, \ \ldots, \ \log T_2 - \log T_1$$

and it is easily shown that this is a sequence of independent exponential random variables
with parameters

$$\ldots, n, n - 1, \ldots, 1$$

respectively. □
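The last claim of the proof, that $\log T_{n+1} - \log T_n$ is exponential with parameter $n$, can be probed by simulation. The sketch below only compares the empirical mean with $1/n$, so it is a plausibility check and not a verification of the full distribution or of the independence. The function name is illustrative.

```python
import math
import random

def log_gap_samples(n, n_samples, rng):
    """Samples of log T_{n+1} - log T_n for a standard Poisson process."""
    samples = []
    for _ in range(n_samples):
        # T_n is a sum of n independent Exp(1) waiting times.
        t_n = sum(rng.expovariate(1.0) for _ in range(n))
        t_next = t_n + rng.expovariate(1.0)
        samples.append(math.log(t_next) - math.log(t_n))
    return samples

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (1, 2, 5):
        gaps = log_gap_samples(n, 20000, rng)
        print(n, sum(gaps) / len(gaps), 1.0 / n)
```

The agreement reflects the fact that $T_n / T_{n+1}$ has a Beta$(n, 1)$ law, whose negative logarithm is exponential with parameter $n$.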

3.2 Continuous setting 

Continuous-state branching processes (or CSBP's) were introduced by Lamperti [14, 15] as
limits of rescaled branching processes. Typically, a CSBP is a time-homogeneous Markov
process with values in $\mathbb{R}_+$,

$$Z = (Z(t, a),\ t \ge 0 \text{ and } a \ge 0),$$

(where the parameter $t$ refers to time and the parameter $a$ to the starting point, i.e.
$Z(0, a) = a$ a.s.) which fulfils the branching property: the path-valued process $(Z(\cdot, x), x \ge
0)$ has independent and stationary increments. In particular, if $Z'(\cdot, y)$ is an independent
copy of $Z(\cdot, y)$, then $Z(\cdot, x) + Z'(\cdot, y)$ has the law of $Z(\cdot, x + y)$. There is a simple
relation connecting CSBP's and Bochner's subordination for subordinators which enables
us to define their genealogy; we refer the interested reader to [4] for heuristics, detailed
arguments etc.

We call a continuous state Yule process a CSBP

$$Y = (Y(t, a),\ t \ge 0 \text{ and } a \ge 0),$$

which evolves as follows: for each $a > 0$, the process $Y(\cdot, a)$ waits an exponential time
with parameter $a$ and then jumps to $a + 1$. It then evolves independently as if it had been
started in state $a + 1$. In terms of the genealogy, the sub-population of size 1 which is born
at a jump time has a parent which is chosen uniformly at random from the population
present before the jump. Note that this genealogy is easy to describe in a consistent
manner for different values $a$ of the starting population.
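The jump dynamics just described, a wait of exponential parameter equal to the current state followed by a jump of size 1, can be simulated for a single starting value $a$. The sketch below ignores the genealogy and the coupling across different values of $a$, and the function name is illustrative.

```python
import math
import random

def continuous_yule(a, t_max, rng=random):
    """Value Y(t_max, a) of the continuous state Yule process started at a > 0."""
    t, state = 0.0, float(a)
    while True:
        # In state y the process waits an exponential time of parameter y.
        t += rng.expovariate(state)
        if t > t_max:
            return state
        state += 1.0  # a sub-population of size 1 is born

if __name__ == "__main__":
    a, t_max = 0.5, 1.0
    samples = [continuous_yule(a, t_max) for _ in range(2000)]
    # The rescaled mean should be close to a if exp(-t) Y(t, a) is a martingale.
    print(math.exp(-t_max) * sum(samples) / len(samples))
```

Starting from $a = 1$ this reproduces the population count of the Yule process with 2 offspring.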

It is immediate that for an integer starting point $a \in \mathbb{N}$, the process $(Y(t, a), t \ge 0)$
is a Yule process $Y^{(1)}$ with 2 offspring, as considered in the preceding section. However,
we stress that its genealogy is not the same as that of $Y^{(1)}$ as we are dealing with a
continuous population in the first case and a discrete population in the second.

We have the following analogue of Lemma 3.

Lemma 6. For every $a > 0$, the process $(e^{-t} Y(t, a), t \ge 0)$ is a uniformly integrable
martingale. Its limit, say $\gamma(a)$, viewed as a process in the variable $a$, has the same finite
dimensional laws as a standard gamma process.

Proof: For $a = 1$, we see from Lemma 3 and the identity in distribution $Y(\cdot, 1) \overset{d}{=} Y^{(1)}(\cdot)$
that $(e^{-t} Y(t, 1), t \ge 0)$ is a uniformly integrable martingale and that its limit has the
standard exponential distribution. The proof is easily completed by an appeal to the
branching property. □

Remark. The limiting distribution in Lemma 6 is essentially a corollary of Theorem 3
of Grey [10].

Just as in the preceding section, we think of $\gamma(a)$ as the size of the terminal population
when the initial population has size $a$. We can express $\gamma(a)$ as

$$\gamma(a) = \sum_{b \le a} \delta_b,$$

where $\delta := (\delta_b, b \ge 0)$ is the jump process of $\gamma$, which corresponds to decomposing the
terminal population into sub-populations having the same ancestor at the initial time.
We write $G(0, a)$ for the sequence of the jumps of $\gamma$ on $[0, a]$, ranked in decreasing order,
and we deduce from Lemma 6 that conditionally on $\gamma(a) = g$, $G(0, a)/g$ has distribution
PD$(a)$.

More generally, by the branching property, we can decompose the terminal population 
into sub-populations having the same ancestor at any given time t. This gives 

$$\gamma(a) = e^{-t} \sum_{b \le Y(t, a)} \delta^{(t)}_b,$$

where $\delta^{(t)} := (\delta^{(t)}_b, b \ge 0)$ is the jump process of a standard gamma process $\gamma^{(t)}$ which is
independent of the Yule process up to time $t$, $(Y(s, c),\ s \in [0, t] \text{ and } c \ge 0)$. This enables
us to define for each $a > 0$ the genealogical process associated to a Yule process $Y(\cdot, a)$,

$$G(\cdot, a) = (G(t, a),\ t \ge 0),$$

where $e^{t} G(t, a)$ is the ranked sequence of the sizes of the jumps of the subordinator $\gamma^{(t)}$
on the interval $[0, Y(t, a)]$.

An easy variation of the arguments for the proof of Theorem 4 shows that the
genealogical structure of the Yule process can be described in terms of the fragmentation
chain $X^{(\infty)}$ of Section 2.3 as follows.

Theorem 7. Fix $a, g > 0$ and let the chain $X^{(\infty)}$ have initial distribution PD$(a)$.
Introduce a standard Poisson process, $N = (N_t, t \ge 0)$, which is independent of the chain
$X^{(\infty)}$. Then the compound chain

$$\big( g X^{(\infty)}(N_{gt}),\ t \ge 0 \big)$$

has the same law as the time-changed process

$$\big( G(\log(1 + t), a),\ t \ge 0 \big)$$

conditioned on $\gamma(a) = g$.

Likewise, the analogue of Corollary 5 is as follows.

Corollary 8. Fix $a > 0$. Then

$$\Big( \frac{1}{\gamma(a)}\, G\big( \log(1 + e^{-t}/\gamma(a)),\ a \big),\ t \in \mathbb{R} \Big)$$

is a time-homogeneous Markov coalescent process which is independent of $\gamma(a)$. Suppose
that it is in state $x \in \Delta_\infty$ and recall Remark (a) of Section 2.3. Then if

$$\lim_{\varepsilon \to 0+} \frac{1}{\log(1/\varepsilon)} \max\{ i : x_i > \varepsilon \} = n + a,$$

the process waits an exponential time of parameter $n$ and then jumps to a variable
distributed as $\mathrm{Coag}_{1/(n+a)}(x)$, independently of the exponential time.